THE MAXIMUM OF A RANDOM WALK REFLECTED
AT A GENERAL BARRIER 

By Niels Richard Hansen 

University of Copenhagen 

We define the reflection of a random walk at a general barrier and 
derive, in case the increments are light tailed and have negative mean,
a necessary and sufficient criterion for the global maximum of the
reflected process to be finite a.s. If it is finite a.s., we show that the tail
of the distribution of the global maximum decays exponentially fast 
and derive the precise rate of decay. Finally, we discuss an example 
from structural biology that motivated the interest in the reflection 
at a general barrier. 

1. Introduction. The reflection of a random walk at zero is a well-studied 
process with several applications. We mention the interpretation from queue- 
ing theory — for a suitably defined random walk — as the waiting time until 
service for a customer at the time of arrival; see, for example, [1]. Another 
important application arises in molecular biology in the context of local com- 
parison of two finite sequences. To evaluate the significance of the findings 
from such a comparison, one needs to study the distribution of the locally 
highest scoring segment from two independent i.i.d. sequences, as shown in 
[8], which equals the distribution of the maximum of a random walk reflected
at zero. 

The global maximum of a random walk with negative drift and, in par- 
ticular, the probability that the maximum exceeds a high value have also 
been studied in detail. A classical reference is [3], Chapter XI.6 and page
393. The probability that the maximum exceeds level x has an important 
interpretation as a ruin probability — the probability of ultimate ruin — for 
a company with initial capital x. It also turns out that the distribution of 
the global maximum coincides with the time invariant distribution for the 
reflected random walk; see [1]. 
In this paper we deal with a situation somewhere in between the reflection
at zero and the unreflected random walk. We define the reflection of
the random walk at a general (negative) barrier. Then we study the global 
maximum of the reflected process in case it is finite. This has an interpre- 
tation in the context of aligning sequences locally as introducing a penalty 
on the length of the initial unaligned part of the sequences. We discuss this 
type of application in greater detail in Section 4. 

We consider only random walks with light tailed increments, that is, in- 
crements for which the distribution has exponential moments. The main 
result is Theorem 2.3 stating that, if the global maximum is finite, the tail 
of the distribution of the global maximum of the reflected process decays 
exponentially fast with the same rate as for the global maximum of the or- 
dinary random walk. The difference is a constant of proportionality, which 
we characterize. 

Let $(X_n)_{n \geq 1}$ be a sequence of i.i.d. real-valued stochastic variables defined
on $(\Omega, \mathcal{F}, P)$ and define the corresponding random walk $(S_n)_{n \geq 0}$ starting at
0 by $S_0 = 0$ and for $n \geq 1$,

$S_n = \sum_{k=1}^{n} X_k$.

The reflection of the random walk at the zero barrier is the process
$(W_n)_{n \geq 0}$ defined recursively by $W_0 = 0$ and for $n \geq 1$,

(1) $W_n = \max\{W_{n-1} + X_n, 0\}$.

A useful alternative representation of the reflected random walk is

(2) $W_n = S_n - \min_{0 \leq k \leq n} S_k$,

for which the r.h.s. is easily verified to satisfy the recursion (1).

The purpose of this paper is to investigate the reflection at a general,
possibly curved, barrier. Assume therefore that a function

$g : \mathbb{N} \to (-\infty, 0]$

is given and define the process $(W^g_n)_{n \geq 0}$ by $W^g_0 = 0$ and recursively for $n \geq 1$,
by

(3) $W^g_n = \max\{W^g_{n-1} + X_n, g(n)\}$.

We call $(W^g_n)_{n \geq 0}$ the reflection of the random walk at the barrier given by $g$.
It satisfies $W^g_n \geq g(n)$ and $W^g_n \geq S_n$ for all $n$, and it is, like the reflection
at zero, a Markov chain, though, in general, a time-inhomogeneous Markov
chain. For $g \equiv 0$, we obtain the reflection at zero, but we are more interested
in the situation where $g(n) \to -\infty$ for $n \to \infty$. Observe that a representation
similar to (2) is possible,

(4) $W^g_n = S_n - \min_{0 \leq k \leq n} \{S_k - g(k)\} = S_n + \max_{0 \leq k \leq n} \{g(k) - S_k\}$,

which is seen by verifying that the r.h.s. of (4) satisfies (3). We will prefer
the second representation in (4).
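The equivalence of the recursion (3) and the representation (4) can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the function names, the integer-valued barrier $g(n) = -\lfloor n/2 \rfloor$ and the $\pm 1$ increments are illustrative assumptions chosen so that all arithmetic is exact, and $g(0) = 0$ is used as the convention for the $k = 0$ term.

```python
import random


def reflect_recursive(increments, g):
    """Reflected walk from recursion (3): W_0 = 0, W_n = max(W_{n-1} + X_n, g(n))."""
    path = [0]
    for n, x in enumerate(increments, start=1):
        path.append(max(path[-1] + x, g(n)))
    return path


def reflect_closed_form(increments, g):
    """Reflected walk from (4): W_n = S_n + max_{0<=k<=n} (g(k) - S_k), taking g(0) = 0."""
    s = 0
    running_max = 0  # k = 0 term: g(0) - S_0 = 0
    path = [0]
    for n, x in enumerate(increments, start=1):
        s += x
        running_max = max(running_max, g(n) - s)
        path.append(s + running_max)
    return path


def example_barrier(n):
    """Hypothetical integer-valued barrier g(n) = -floor(n / 2), decreasing to -infinity."""
    return -(n // 2)


rng = random.Random(0)  # fixed seed so the run is reproducible
steps = [1 if rng.random() < 0.3 else -1 for _ in range(1000)]  # negative drift
same = reflect_recursive(steps, example_barrier) == reflect_closed_form(steps, example_barrier)
print("recursion (3) and representation (4) agree:", same)
```

The closed form only needs the running maximum of $g(k) - S_k$, which is the quantity that reappears as $D$ in the results below.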

2. Results. We state the results obtained in this paper as Theorem 2.1 
and Theorem 2.3. The proofs are given in the next section. To state the 
results we need a few assumptions and definitions. 

We will assume that the Laplace transform $\phi(\theta) = E(\exp(\theta X_1))$ is finite
for $\theta$ in an open interval $(a, b)$ containing 0. In particular, $X_1$ has mean
value, which we denote by $\mu = E(X_1) = \partial_\theta \phi(0)$. We will assume that $\mu < 0$
and that $\phi(\theta) \to \infty$ for $\theta \to b$. In this case there exists a solution $\theta^* > 0$ to
the equation $\phi(\theta) = 1$, which is unique due to convexity of $\phi$. The stochastic
process $(L^*_n)_{n \geq 0}$ defined by

$L^*_n = \exp(\theta^* S_n)$

is a positive martingale w.r.t. the filtration $(\mathcal{F}_n)_{n \geq 0}$ generated by the X-process
($\mathcal{F}_0 = \{\emptyset, \Omega\}$). Furthermore, $E(L^*_n) = 1$ so $L^*_n$ defines a probability
measure $P^*_n$ on $\mathcal{F}_n$ with Radon-Nikodym derivative $L^*_n$ w.r.t. the restriction
of $P$ to $\mathcal{F}_n$. Letting $P^*$ denote the set function defined on the algebra
$\bigcup_n \mathcal{F}_n$ by its restriction to $\mathcal{F}_n$ being $P^*_n$, then $P^*$ is, in fact, a probability
measure on the algebra and it has a unique extension to $\mathcal{F}_\infty =
\sigma(\bigcup_n \mathcal{F}_n)$, the least $\sigma$-algebra generated by the filtration; see [10], Section
1.5. The probability measure $P^*$ is called the exponentially changed
or tilted measure. That $P^*$ is a probability measure and, in particular, that
it is $\sigma$-additive, can be seen as follows. Let $\nu = X_1(P^*_1)$ denote the distribution
of $X_1$ under $P^*_1$, then, for $F \in \bigcup_n \mathcal{F}_n$, there exists $B \in \mathbb{B}^{\otimes \mathbb{N}}$ such that
$F = ((X_n)_{n \geq 1} \in B)$, hence, $P^*(F) = \nu^{\otimes \mathbb{N}}(B)$ where $\nu^{\otimes \mathbb{N}}$ is the infinite product
measure on $(\mathbb{R}^{\mathbb{N}}, \mathbb{B}^{\otimes \mathbb{N}})$. Then $\sigma$-additivity of $P^*$ on $\bigcup_n \mathcal{F}_n$ follows from
$\sigma$-additivity of $\nu^{\otimes \mathbb{N}}$. In addition, we see that, under $P^*$, the stochastic variables
$(X_n)_{n \geq 1}$ are i.i.d. with mean $\mu^* = E^*(X_1) = \partial_\theta \phi(\theta^*) > 0$. A more general
treatment of exponential change of measure techniques can be found in
[1], Chapter XIII.
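To make the tilting concrete, the root $\theta^*$ and the tilted increment distribution can be computed for a finite-valued increment. The sketch below is an illustration only: the increment distribution ($+1$ with probability 0.3, $-1$ with probability 0.7), the bisection solver and all function names are assumptions introduced here, not part of the paper. For this distribution the root is known in closed form, $\theta^* = \log(0.7/0.3)$, and tilting swaps the two probabilities.

```python
import math


def laplace_transform(theta, values, probs):
    """phi(theta) = E exp(theta * X_1) for a finite-valued increment distribution."""
    return sum(p * math.exp(theta * x) for x, p in zip(values, probs))


def solve_theta_star(values, probs, tol=1e-12):
    """Bisection for the positive root of phi(theta) = 1.

    Assumes negative mean and that phi eventually exceeds 1, so that by
    convexity phi <= 1 on [0, theta*] and phi > 1 beyond theta*.
    """
    upper = 1.0
    while laplace_transform(upper, values, probs) <= 1.0:
        upper *= 2.0
    lower = 0.0
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if laplace_transform(mid, values, probs) > 1.0:
            upper = mid
        else:
            lower = mid
    return 0.5 * (lower + upper)


values = [1, -1]
probs = [0.3, 0.7]  # mean -0.4 under P
theta_star = solve_theta_star(values, probs)
# Under P*, each value x gets probability p_x * exp(theta* x).
tilted = [p * math.exp(theta_star * x) for x, p in zip(values, probs)]
mean_star = sum(x * p for x, p in zip(values, tilted))
print(theta_star, tilted, mean_star)
```

The tilted probabilities sum to one precisely because $\phi(\theta^*) = 1$, and the tilted mean is positive, in line with $\mu^* > 0$.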

We denote the maximum of the random walk reflected at the barrier $g$ by

(5) $\mathcal{M}^g = \sup_n W^g_n$

and we define

(6) $D = \sup_n \{g(n) - S_n\}$,

which may be infinite with positive $P$-probability, but $D$ is always $P^*$-a.s.
finite.
Theorem 2.1. It holds that

(7) $P(\mathcal{M}^g > u) \leq \exp(-\theta^* u) E^*(\exp(\theta^* D))$,

and $P(\mathcal{M}^g < \infty) = 1$ if and only if

(8) $E^*(\exp(\theta^* D)) < \infty$.

Moreover, with $g(0) = 0$,

(9) $E^*(\exp(\theta^* D)) \leq \sum_{n=0}^{\infty} \exp(\theta^* g(n))$.

Remark 2.2. The second inequality provides us with an applicable,
sufficient criterion for almost sure finiteness of $\mathcal{M}^g$, namely,

$\sum_{n=1}^{\infty} \exp(\theta^* g(n)) < \infty$.

Interestingly, this infinite sum and the corresponding finiteness criterion
occurred in [9] in the analysis of local sequence alignment. In their setup $g$
denotes a gap penalty function.

The ascending ladder height distribution of the random walk $(S_n)_{n \geq 0}$
under $P^*$ is defined by

$G^*_+(x) = P^*(S_{\tau_+} \leq x)$,

with $\tau_+ = \inf\{n \geq 0 \mid S_n > 0\}$. Note that since $\mu^* > 0$, it follows that $\tau_+ < \infty$
$P^*$-a.s. so that $G^*_+$ is a well-defined probability measure.

Theorem 2.3. If (8) holds, if the distribution of $X_1$ is nonarithmetic,
and if $B$ is a stochastic variable with distribution

(10) $P^*(B \leq x) = \frac{1}{E^*(S_{\tau_+})} \int_0^x (1 - G^*_+(y)) \, dy$,

then

(11) $P(\mathcal{M}^g > u) \sim \exp(-\theta^* u) E^*(\exp(\theta^* D)) E^*(\exp(-\theta^* B))$

for $u \to \infty$.



Remark 2.4. The stochastic variable $D$ has an alternative representation.
Define the sequence of stopping times $(\tau^g_-(n))_{n \geq 0}$ by $\tau^g_-(0) = 0$ and for
$n \geq 1$,

$\tau^g_-(n) = \inf\{k > \tau^g_-(n-1) \mid S_k - g(k) < S_{\tau^g_-(n-1)} - g(\tau^g_-(n-1))\}$.

For $\tau^g_-(n) < \infty$, define, in addition, the corresponding "undershoot" by

$U_n = g(\tau^g_-(n)) - g(\tau^g_-(n-1)) + S_{\tau^g_-(n-1)} - S_{\tau^g_-(n)}$.

Since $\mu^* > 0$, it holds that $S_n \to +\infty$ $P^*$-a.s., and we have that $\tau^g_-(n) = \infty$
eventually $P^*$-a.s. If we define $\rho = \inf\{k \mid \tau^g_-(k) = \infty\} - 1$, then

$D = \sum_{k=1}^{\rho} U_k$.

Remark 2.5. If we assume that the distribution of $X_1$ is arithmetic
with span $\delta$, say, the random walk is restricted to the lattice $\delta\mathbb{Z}$, but the
reflected process may be pushed out of the lattice by the reflection. The best
result obtainable for a general $g$ is then

$E^*(\exp(\theta^* D)) E^*(\exp(-\theta^* B)) \exp(-\theta^* \delta)$
$\leq \liminf_{u \to \infty} \exp(\theta^* u) P(\mathcal{M}^g > u)$
$\leq \limsup_{u \to \infty} \exp(\theta^* u) P(\mathcal{M}^g > u)$
$\leq E^*(\exp(\theta^* D)) E^*(\exp(-\theta^* B))$.

However, if $g$ takes values in $\delta\mathbb{Z}$ only, (11) holds provided that $u \to \infty$ within
$\delta\mathbb{Z}$.

Example 2.6. The linear barrier $g(n) = -\alpha n$ for $\alpha > 0$ is particularly
simple to handle. First we find that

$\sum_{n=1}^{\infty} \exp(-\theta^* \alpha n) < \infty$

and it follows from Theorem 2.1 that $\mathcal{M}^g < \infty$ almost surely. Moreover,
from (6) we obtain that

$D = \sup_n \{-\alpha n - S_n\}$,

so $D$ is, in fact, the maximum of a random walk $(\tilde{S}_n)_{n \geq 0}$ with increments
$-\alpha - X_n$ for $n \geq 1$. The distribution of $D$ can be found explicitly in terms
of the ascending ladder height distribution for $(\tilde{S}_n)_{n \geq 0}$. That is, with $\tilde{G}_+$
denoting the (defective) ascending ladder height distribution given by

$\tilde{G}_+(x) = P^*(\tilde{S}_{\tilde{\tau}_+} \leq x, \tilde{\tau}_+ < \infty)$,

where $\tilde{\tau}_+ = \inf\{n \geq 0 \mid \tilde{S}_n > 0\}$, we have that

$P^*(D \leq x) = P^*(\tilde{\tau}_+ = \infty) \sum_{n=0}^{\infty} (\tilde{G}_+)^{*n}(x)$,

see Theorem VIII.2.2 in [1]. Note that this representation is, in fact, equivalent
to the representation $D = \sum_{k=1}^{\rho} U_k$ in Remark 2.4, since we can identify
the ascending ladder epochs for $(\tilde{S}_n)_{n \geq 0}$ with the stopping times $(\tau^g_-(n))_{n \geq 0}$.
The conclusion is that $D$ is a sum of a geometrically distributed number of
i.i.d. variables each with distribution $P^*(\tilde{\tau}_+ < \infty)^{-1} \tilde{G}_+$.
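For the linear barrier the sum in (9) is geometric, so the bound obtained by combining (7) and (9) is explicit and can be compared with a simulation. The sketch below is a hedged illustration under assumptions that are not in the paper: $\pm 1$ increments with up-probability 0.3 (an arithmetic distribution, which is harmless here because (7) and (9) do not need the nonarithmetic assumption), slope $\alpha = 0.5$, level $u = 5$, a finite horizon of 200 steps and a fixed seed. The finite-horizon maximum is at most $\mathcal{M}^g$, so the simulated frequency estimates a lower bound for $P(\mathcal{M}^g > u)$.

```python
import math
import random


def finite_horizon_max(rng, p_up, alpha, horizon):
    """Maximum over n <= horizon of the +-1 walk reflected at g(n) = -alpha * n."""
    w = 0.0
    best = 0.0
    for n in range(1, horizon + 1):
        x = 1.0 if rng.random() < p_up else -1.0
        w = max(w + x, -alpha * n)  # recursion (3)
        best = max(best, w)
    return best


def estimate_tail(u, p_up, alpha, horizon, paths, seed):
    """Monte Carlo frequency of {max_{n <= horizon} W_n > u} under P."""
    rng = random.Random(seed)
    hits = sum(finite_horizon_max(rng, p_up, alpha, horizon) > u for _ in range(paths))
    return hits / paths


def tail_bound(u, theta_star, alpha):
    """Bound from (7) and (9): exp(-theta* u) * sum_{n>=0} exp(-theta* alpha n)."""
    return math.exp(-theta_star * u) / (1.0 - math.exp(-theta_star * alpha))


p_up, alpha, u = 0.3, 0.5, 5
theta_star = math.log((1 - p_up) / p_up)  # root of phi(theta) = 1 for the +-1 walk
estimate = estimate_tail(u, p_up, alpha, horizon=200, paths=4000, seed=1)
bound = tail_bound(u, theta_star, alpha)
print(f"simulated frequency {estimate:.4f}, bound {bound:.4f}")
```

The bound is not expected to be tight: it discards the overshoot factor $\exp(-\theta^* B_u)$ and replaces $E^*(\exp(\theta^* D))$ by the series in (9).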

Example 2.7. With $g(n) = -\rho \log n$ for $\rho > 0$, we get an interesting class
of barriers, for which the maximum $\mathcal{M}^g$ is finite or infinite a.s. according to
whether $\rho > 1/\theta^*$ or $\rho < 1/\theta^*$. Indeed, we observe that

$\max_{0 \leq m \leq n} W^g_m = \max_{0 \leq m \leq n} \{ S_m + \max_{1 \leq k \leq m} \{-\rho \log k - S_k\} \}$
$\geq \max_{0 \leq m \leq n} \{ S_m - \min_{1 \leq k \leq m} S_k \} - \rho \log n$
$= \max_{0 \leq m \leq n} W_m - \rho \log n$,

where $(W_n)_{n \geq 0}$ is the reflection at zero. Since

$\max_{0 \leq m \leq n} W_m - \frac{1}{\theta^*} \log n$

converges in distribution [7, 8] (in the arithmetic case, the sequence is tight),
we get for $\rho < 1/\theta^*$ that $\mathcal{M}^g = \infty$ a.s. On the other hand, we find that

$\sum_{n=1}^{\infty} \exp(-\theta^* \rho \log n) = \sum_{n=1}^{\infty} n^{-\theta^* \rho}$,

which is finite precisely when $\rho > 1/\theta^*$. Hence, for $\rho > 1/\theta^*$, it follows from
Theorem 2.1 that $\mathcal{M}^g < \infty$ a.s. and Theorem 2.3 holds.
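The dichotomy for the logarithmic barrier is that of the series $\sum_n n^{-\theta^* \rho}$, which can be seen in its partial sums. The following sketch is illustrative only: the function name and the value $\theta^* = \log(7/3)$ (the root for a $\pm 1$ walk with up-probability 0.3) are assumptions made for the example.

```python
import math


def barrier_series_partial_sum(theta_star, rho, terms):
    """Partial sum of sum_{n>=1} exp(theta* g(n)) for g(n) = -rho log n, i.e. sum n^(-theta* rho)."""
    exponent = theta_star * rho
    return sum(n ** (-exponent) for n in range(1, terms + 1))


theta_star = math.log(7 / 3)  # assumed value, e.g. a +-1 walk with up-probability 0.3
for rho in (0.5 / theta_star, 2.0 / theta_star):
    sums = [barrier_series_partial_sum(theta_star, rho, 10 ** k) for k in (2, 3, 4)]
    print(f"theta* rho = {theta_star * rho:.2f}: partial sums {sums}")
```

With $\theta^* \rho = 2$ the partial sums stay below $\pi^2/6$, while with $\theta^* \rho = 1/2$ they keep growing roughly like $2\sqrt{N}$, matching the finite and infinite regimes of the example.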



3. Proofs. The proofs of Theorems 2.1 and 2.3 are based on the expo- 
nential change of measure technique as introduced in the previous section. 
We briefly review how this technique is used to obtain similar results for 
the maximum of an ordinary random walk. For more details, we refer to [1], 
Sections XIII.3 and XIII.5. 

We first observe that, for any stopping time $\tau$ and any $\mathcal{F}_\tau$-measurable,
positive stochastic variable $Y$, it holds that

(12) $E^*(Y; \tau < \infty) = E(Y L^*_\tau; \tau < \infty)$.

This follows easily by $(\tau < \infty) = \bigcup_n (\tau = n)$ and that $P^*$ has Radon-Nikodym
derivative $L^*_n$ w.r.t. $P$ on $\mathcal{F}_n$, see also [1], Theorem XIII.3.2. A useful consequence
for $Y = \exp(-\theta^* S_\tau)$, in which case $Y L^*_\tau = 1$, is that

(13) $E^*(\exp(-\theta^* S_\tau); \tau < \infty) = P(\tau < \infty)$.
We let

$\mathcal{M} = \sup_n S_n$

denote the global maximum of the random walk, which is finite due to the
negative drift under $P$. Defining $\tau(u) = \inf\{n \geq 0 \mid S_n > u\}$ and using (13),
we get, since $P^*(\tau(u) < \infty) = 1$ due to the positive drift of the random walk
under $P^*$, that

(14) $P(\mathcal{M} > u) = P(\tau(u) < \infty) = E^*(\exp(-\theta^* S_{\tau(u)})) = \exp(-\theta^* u) E^*(\exp(-\theta^* (S_{\tau(u)} - u)))$.

For the overshoot of level $u$ at time $\tau(u)$, it holds that

(15) $S_{\tau(u)} - u \xrightarrow{\mathcal{D}} B$

under $P^*$ for $u \to \infty$, see Theorem VIII.2.1 in [1], with $B$ a stochastic variable
with distribution given by (10). (If the distribution of $X_1$ is arithmetic with
span $\delta$, say, the limit has to go through multiples of $\delta$.) This implies that

$P(\mathcal{M} > u) \sim \exp(-\theta^* u) E^*(\exp(-\theta^* B))$

for $u \to \infty$.
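As a worked instance of (14), consider the simple random walk, which is an assumed example and not one treated in the paper: $X_1 = +1$ with probability $p$ and $X_1 = -1$ with probability $q = 1 - p$, where $p < 1/2$. This distribution is arithmetic with span 1, so $u$ is restricted to the integers. The overshoot is then deterministic and (14) reduces to the classical geometric law of the maximum.

```latex
% Assumed example: P(X_1 = 1) = p, P(X_1 = -1) = q = 1 - p, p < 1/2.
\phi(\theta) = p e^{\theta} + q e^{-\theta} = 1
\iff p e^{2\theta} - e^{\theta} + q = 0
\iff e^{\theta} \in \{1, q/p\},
\qquad \text{so } \theta^* = \log(q/p) > 0 .

% For integer u >= 0 the walk first exceeds u at height u + 1,
% so S_{\tau(u)} - u = 1 and (14) gives
P(\mathcal{M} > u)
  = e^{-\theta^* u} E^*\bigl(e^{-\theta^* (S_{\tau(u)} - u)}\bigr)
  = e^{-\theta^* u} e^{-\theta^*}
  = (p/q)^{u+1} .
```

This agrees with the gambler's ruin formula $P(\mathcal{M} \geq k) = (p/q)^k$ for integer $k \geq 0$.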

Proof of Theorem 2.1. Introduce the stopping time

(16) $\tau^g(u) = \inf\{n \geq 0 \mid W^g_n > u\}$

for $u \geq 0$. Since $W^g_n \geq S_n$ and $S_n \to \infty$ $P^*$-a.s., we find that $P^*(\tau^g(u) < \infty) =
1$ and (13) gives that

(17) $P(\mathcal{M}^g > u) = P(\tau^g(u) < \infty) = E^*(\exp(-\theta^* S_{\tau^g(u)})) = \exp(-\theta^* u) E^*(\exp(-\theta^* (S_{\tau^g(u)} - u)))$.

Write $S_{\tau^g(u)} - u$ as

$S_{\tau^g(u)} - u = W^g_{\tau^g(u)} - u - (W^g_{\tau^g(u)} - S_{\tau^g(u)}) = B_u - D_u$

with $D_u = W^g_{\tau^g(u)} - S_{\tau^g(u)}$ and $B_u = W^g_{\tau^g(u)} - u > 0$.

It follows from (4) that $D_u \leq D$. Especially, since $\exp(-\theta^* B_u) \leq 1$,

$P(\mathcal{M}^g > u) \leq \exp(-\theta^* u) E^*(\exp(\theta^* D_u)) \leq \exp(-\theta^* u) E^*(\exp(\theta^* D))$

and (7) follows. If $E^*(\exp(\theta^* D)) < \infty$, this implies that $P(\mathcal{M}^g < \infty) = 1$. On
the contrary, if $P(\mathcal{M}^g = \infty) > 0$, it follows that

$\exp(\theta^* u) P(\mathcal{M}^g = \infty) \leq E^*(\exp(\theta^* D))$

with the l.h.s. tending to infinity as $u \to \infty$ so $E^*(\exp(\theta^* D)) = \infty$.

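The second part of the proof consists of verifying (9). By partial integration,

$E^*(\exp(\theta^* D)) = \int_{-\infty}^{\infty} \theta^* \exp(\theta^* u) P^*(D > u) \, du$.

Introducing

$\tilde{\tau}(u) = \inf\{n \geq 0 \mid g(n) - S_n > u\}$,

an application of (12) with $Y = 1$ yields

$P^*(D > u) = P^*(\tilde{\tau}(u) < \infty) = E(\exp(\theta^* S_{\tilde{\tau}(u)}); \tilde{\tau}(u) < \infty)$.

Hence,

$E^*(\exp(\theta^* D)) = \int_{-\infty}^{\infty} \theta^* \exp(\theta^* u) E(\exp(\theta^* S_{\tilde{\tau}(u)}); \tilde{\tau}(u) < \infty) \, du$
$= \sum_{n=0}^{\infty} E\left( \exp(\theta^* S_n) \int_{-\infty}^{\infty} \theta^* \exp(\theta^* u) 1(\tilde{\tau}(u) = n) \, du \right)$.

To bound the inner integral, we introduce for $n \geq 0$ the variable

$U_n = \sup\{u \mid \tilde{\tau}(u) = n\}$

with the usual convention that the supremum of the empty set is $-\infty$. By
definition,

$g(\tilde{\tau}(u)) - S_{\tilde{\tau}(u)} > u$,

hence, for all $u$ with $\tilde{\tau}(u) = n$, it holds that $S_n + u < g(n)$ and, in particular,

(18) $S_n + U_n \leq g(n)$.

A moment's reflection should convince the reader that we have equality whenever
$U_n > -\infty$, but the inequality holds for all $n$. Since

$\int_{-\infty}^{\infty} \theta^* \exp(\theta^* u) 1(\tilde{\tau}(u) = n) \, du \leq \int_{-\infty}^{U_n} \theta^* \exp(\theta^* u) \, du = \exp(\theta^* U_n)$,

we obtain, using (18), the inequality

$E^*(\exp(\theta^* D)) \leq \sum_{n=0}^{\infty} E(\exp(\theta^* (S_n + U_n))) \leq \sum_{n=0}^{\infty} \exp(\theta^* g(n))$.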

The proof of Theorem 2.3 relies on the following lemma. 
Lemma 3.1. With $\tau^g(u)$ defined by (16), then if $X_1$ is nonarithmetic, if
$h : [0, \infty) \to \mathbb{R}$
is a bounded, continuous function, and if $B$ is a stochastic variable with
distribution given by (10), it holds that

(19) $E^*(h(W^g_{\tau^g(u)} - u) \mid \mathcal{F}_{\tau^g(u/2)}) \xrightarrow{P^*} E^* h(B)$

for $u \to \infty$.

Proof. First we make a general observation. If $X$ and $X'$ are two identically
distributed stochastic variables that take values in a space $E$, if $\mathcal{G}$
is a $\sigma$-algebra such that $X'$ is independent of $\mathcal{G}$, if $Y$ is a $\mathcal{G}$-measurable,
real valued stochastic variable, and, finally, if $k : E \times \mathbb{R} \to \mathbb{R}$ is a bounded,
measurable function, then with

$H(u) = E(k(X, u))$,

it holds that

(20) $E(k(X', Y) \mid \mathcal{G}) = H(Y)$.

For convenience, extend $h$ to be defined on $\mathbb{R}$ by $h(u) = 0$ for $u < 0$.
Let $E = \mathbb{R}^{\mathbb{N}}$, $X = (X_n)_{n \geq 1}$, $X' = (X_{\tau^g(u/2)+n})_{n \geq 1}$, $\mathcal{G} = \mathcal{F}_{\tau^g(u/2)}$ and $Y =
u - W^g_{\tau^g(u/2)}$. Obviously $X$ and $X'$ have the same distribution, $X'$ is independent
of $\mathcal{G}$ (under $P^*$) and $Y$ is $\mathcal{G}$ measurable. Recalling the definition
$\tau(u) = \inf\{n \geq 0 \mid S_n > u\}$ and defining $\sigma(u) = \inf\{n \geq 0 \mid \sum_{k=1}^{n} X_{\tau^g(u/2)+k} >
u - W^g_{\tau^g(u/2)}\}$, it follows from (20) that

$E^*\left( h\left( \sum_{k=1}^{\sigma(u)} X_{\tau^g(u/2)+k} - (u - W^g_{\tau^g(u/2)}) \right) \mid \mathcal{F}_{\tau^g(u/2)} \right) = H(u - W^g_{\tau^g(u/2)})$,

where

$H(u) = E^*(h(S_{\tau(u)} - u))$.

From (15) it follows that $H(u) \to E^*(h(B))$ for $u \to \infty$ with the distribution
of $B$ given by (10). Note that here we use the nonarithmetic assumption
for this limit to hold when $u \to \infty$ arbitrarily. Since

$0 < W^g_{\tau^g(u)} - u = S_{\tau^g(u)} - u + D_u \leq S_{\tau(u)} - u + D$,

where the r.h.s. is $P^*$-tight due to (15), we find that

$u - W^g_{\tau^g(u/2)} = u/2 - (W^g_{\tau^g(u/2)} - u/2) \xrightarrow{P^*} \infty$.

We conclude that

(21) $H(u - W^g_{\tau^g(u/2)}) \xrightarrow{P^*} E^*(h(B))$.




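Recall that

$D_u = W^g_{\tau^g(u)} - S_{\tau^g(u)} = \max_{0 \leq k \leq \tau^g(u)} \{g(k) - S_k\}$

and note that since $\tau^g(u) \to \infty$ $P^*$-a.s. for $u \to \infty$, it follows that $D_u = D$
eventually with $P^*$-probability one. Letting $K_u = (D_{u/2} = D)$, then $1(K_u) \to 1$
for $u \to \infty$ $P^*$-a.s. and, in particular, $P^*(K_u^c) \to 0$ for $u \to \infty$. On the
event $K_u$ it holds that $\tau^g(u) = \tau^g(u/2) + \sigma(u)$ and that $W^g_{\tau^g(u/2)} - S_{\tau^g(u/2)} =
D_{u/2} = D_u = W^g_{\tau^g(u)} - S_{\tau^g(u)}$. In particular, on $K_u$,

$\sum_{k=1}^{\sigma(u)} X_{\tau^g(u/2)+k} = S_{\tau^g(u)} - S_{\tau^g(u/2)} = W^g_{\tau^g(u)} - W^g_{\tau^g(u/2)}$.

Then

$E^* \left| E^*(h(W^g_{\tau^g(u)} - u) \mid \mathcal{F}_{\tau^g(u/2)}) - H(u - W^g_{\tau^g(u/2)}) \right| \leq 2 \|h\|_\infty P^*(K_u^c) \to 0$

and this together with (21) completes the proof. □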

Remark 3.2. It follows from Lemma 3.1 that

(22) $W^g_{\tau^g(u)} - u \xrightarrow{\mathcal{D}} B$

for $u \to \infty$ under $P^*$. This is a well-known result from nonlinear renewal
theory, see [11], Theorem 9.12 or [13], Theorem 4.1. The condition that
needs to be fulfilled is that the difference

$W^g_n - S_n = \max_{1 \leq k \leq n} \{g(k) - S_k\}$

must be slowly changing, which is indeed the case since it is $P^*$-a.s. converging
to a finite limit. If we use (22) in the proof above, we can avoid the
tightness argument.

Proof of Theorem 2.3. We use notation as in the proof of Theorem
2.1. From (17) we have that

(23) $P(\mathcal{M}^g > u) = \exp(-\theta^* u) E^*(\exp(-\theta^* B_u) \exp(\theta^* D_u))$.

With $K_u = (D_{u/2} = D)$, then since $D_{u/2} = D_u$ on $K_u$, since $B_u > 0$, and
since $D_u \leq D$, we see that

$E^* \left| \exp(-\theta^* B_u) \exp(\theta^* D_u) - \exp(-\theta^* B_u) \exp(\theta^* D_{u/2}) \right| \leq E^*(\exp(\theta^* D) 1(K_u^c)) \to 0$,

using dominated convergence and that $1(K_u^c) \to 0$ $P^*$-a.s. as noted in the proof
of Lemma 3.1. Since $\exp(\theta^* D_{u/2}) \nearrow \exp(\theta^* D)$ with $E^*(\exp(\theta^* D)) < \infty$, by
assumption, it follows from Lemma 3.1, using that $\exp(-\theta^* B_u) \leq 1$, that

$\exp(\theta^* D_{u/2}) E^*(\exp(-\theta^* B_u) \mid \mathcal{F}_{\tau^g(u/2)}) \xrightarrow{P^*} \exp(\theta^* D) E^*(\exp(-\theta^* B))$.

Collecting these observations yields

$\lim_{u \to \infty} E^*(\exp(-\theta^* B_u) \exp(\theta^* D_u))$
$= \lim_{u \to \infty} E^*(\exp(-\theta^* B_u) \exp(\theta^* D_{u/2}))$
$= \lim_{u \to \infty} E^*(\exp(\theta^* D_{u/2}) E^*(\exp(-\theta^* B_u) \mid \mathcal{F}_{\tau^g(u/2)}))$
$= E^*(\exp(\theta^* D) E^*(\exp(-\theta^* B)))$
$= E^*(\exp(\theta^* D)) E^*(\exp(-\theta^* B))$. □

4. An application to structural biology. An interesting application of 
the random walk reflected at a general barrier arises when trying to mea- 
sure whether certain structural features are present in an RNA-molecule. An 
RNA-molecule is built from four building blocks — the nucleotides — denoted 
a, c, g and u. They are connected in a linear sequence, and a typical 
representation of an RNA-molecule is as a string of letters, for example,
aaggaacaaccuu. These molecules are, furthermore, capable of forming hy- 
drogen bonds between nonadjacent nucleotides, which makes the molecule 
fold into a three-dimensional structure. The hydrogen bonds are usually 
(and energetically preferably) formed between Watson-Crick pairs, that is, 
between a and u and between c and g. For the short sequence above, it is 
evident that we can pair up the first four letters, aagg, with the last four 
letters inverted, uucc, to form Watson-Crick pairs — leaving the five letters 
aacaa unpaired, see Figure 1. 

A real example is shown in Figure 2. That molecule, which belongs to 
a class of small RNA-molecules known as microRNA (miRNA), is still of
rather moderate length. Many RNA-molecules are larger and form more 
complicated structures, but the essential building blocks are always groups 
of adjacent Watson-Crick pairs similar to the structure shown in Figure 2. 

Fig. 1. A schematic picture of a structure formed by the example RNA-molecule
aaggaacaaccuu. The vertical lines pairing up letters represent hydrogen bonds between
the corresponding nucleic acids. The segment of five letters at the r.h.s. is called a loop.
Fig. 2. An RNA-molecule from the nematode C. elegans known as mir-1, which forms a
structure by Watson-Crick pairing. A few non-Watson-Crick u-g pairs are also formed
in this structure. There is a large loop to the right consisting of unpaired letters [2].

One can suggest the following procedure to search for the local occurrence
of structures within a possibly much longer sequence $y = y_1, \ldots, y_n$. Pick a
pair $(y_i, y_j)$, say, with $i < j$ and compute, for $1 \leq m \leq \min\{i - 1, n - j\}$,

$S^{i,j}_m = \sum_{k=0}^{m} f(y_{i-k}, y_{j+k})$

for some score function $f$. The score function could, for instance, take the
values $+1$ for Watson-Crick pairs and $-1$ otherwise, but for the present
section, it is only important that the mean score under the random model
introduced below is negative. We search for high values of $S^{i,j}_m$ as this implies
a high number of (rather coherent) Watson-Crick pairs. However, if $j - i$ is
large, there is a large loop in between the letters that pair up nicely, and
this is not reasonable. Therefore, we introduce a penalty function $g : \mathbb{N}_0 \to
(-\infty, 0]$ [assume, for convenience, that $g(0) = g(1) = 0$] and define

$\mathcal{M}(y) = \max_{i,j,m} \{S^{i,j}_m + g(j - i - 1)\}$.

If $Y = Y_1, \ldots, Y_n$ is a finite sequence of i.i.d. stochastic variables taking
values in {a, c, g, u}, we are, for example, interested in computing $P(\mathcal{M}(Y) > u)$. What
we will show here is that $\mathcal{M}(Y)$ is, in fact, the maximum of partial maxima
of (dependent) random walks reflected at barriers given in terms of $g$.
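The search procedure can be written as a brute-force computation of $\mathcal{M}(y)$. The sketch below is an illustration under stated assumptions, not the author's implementation: it uses the $+1/-1$ score mentioned in the text, lets $m$ range over $1, \ldots, \min\{i - 1, n - j\}$ as above, and takes a hypothetical logarithmic loop penalty $g(l) = -\rho \log l$ with $\rho = 0.5$ and $g(0) = g(1) = 0$. All function names are made up for the example.

```python
import math

WATSON_CRICK = {("a", "u"), ("u", "a"), ("c", "g"), ("g", "c")}


def pair_score(x, y):
    """Score f: +1 for a Watson-Crick pair and -1 otherwise."""
    return 1 if (x, y) in WATSON_CRICK else -1


def stem_score(seq, i, j, m):
    """S^{i,j}_m = sum_{k=0}^{m} f(y_{i-k}, y_{j+k}) with 1-based positions i < j."""
    return sum(pair_score(seq[i - k - 1], seq[j + k - 1]) for k in range(m + 1))


def no_penalty(loop_length):
    """Penalty identically zero, for comparison."""
    return 0.0


def log_loop_penalty(loop_length, rho=0.5):
    """Hypothetical penalty: g(0) = g(1) = 0 and g(l) = -rho * log(l) for l >= 2."""
    return 0.0 if loop_length <= 1 else -rho * math.log(loop_length)


def best_stem(seq, penalty):
    """Return (value, i, j, m) maximizing S^{i,j}_m + g(j - i - 1) by exhaustive search."""
    n = len(seq)
    best = None
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            for m in range(1, min(i - 1, n - j) + 1):
                value = stem_score(seq, i, j, m) + penalty(j - i - 1)
                if best is None or value > best[0]:
                    best = (value, i, j, m)
    return best


example = "aaggaacaaccuu"  # the example molecule from the text
print(best_stem(example, no_penalty))
print(best_stem(example, log_loop_penalty))
```

For the example molecule the optimum is the stem pairing aagg with ccuu, that is, $(i, j, m) = (4, 10, 3)$ with loop length $j - i - 1 = 5$; the penalty lowers its value from 4 to $4 - 0.5 \log 5$. The exhaustive search is cubic in the sequence length and is meant only to make the definition concrete.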

Assume first that $j - i$ is even and let $n_0 = i + (j - i)/2$. Define

$g_1(k) = g(2k + 1)$

together with

$W^{(1)}_{n_0,m} = \max_{0 \leq k \leq m} \left\{ g_1(k) + \sum_{l=k+1}^{m} f(y_{n_0-l}, y_{n_0+l}) \right\}$.

We observe that, for each $n_0$,

$\max_{i,j,m : i + (j-i)/2 = n_0} \{S^{i,j}_m + g(j - i - 1)\} = \max_m W^{(1)}_{n_0,m}$.

With $X^{n_0}_l = f(Y_{n_0-l}, Y_{n_0+l})$, which for fixed $n_0$ are i.i.d. variables, we observe
that

$W^{(1)}_{n_0,m} = \max_{0 \leq k \leq m} \left\{ g_1(k) + \sum_{l=k+1}^{m} X^{n_0}_l \right\}$
$= \max\left\{ \max_{0 \leq k \leq m-1} \left\{ g_1(k) + \sum_{l=k+1}^{m-1} X^{n_0}_l \right\} + X^{n_0}_m, g_1(m) \right\}$
$= \max\{W^{(1)}_{n_0,m-1} + X^{n_0}_m, g_1(m)\}$.

That is, the process $(W^{(1)}_{n_0,m})_{m \geq 0}$ is a random walk reflected at the barrier
given by $g_1$. A completely analogous derivation can be carried out if $j - i$ is
odd using the reflection barrier

$g_2(k) = g(2k)$,

which for $n_0 = i + (j - i - 1)/2$ leads to the reflected random walk
$(W^{(2)}_{n_0,m})_{m \geq 0}$ fulfilling

$\max_{i,j,m : i + (j-i-1)/2 = n_0} \{S^{i,j}_m + g(j - i - 1)\} = \max_m W^{(2)}_{n_0,m}$.

This shows that $\mathcal{M}(Y)$ is indeed the maximum of partial maxima of reflected
random walks.

Let $(X_n)_{n \geq 1}$ be i.i.d. with $X_1 = f(Y_1, Y_2)$ and $E(X_1) < 0$, and let

$D_i = \sup_n \left\{ g_i(n) - \sum_{k=1}^{n} X_k \right\}$

together with

$K^* = (E^*(\exp(\theta^* D_1)) + E^*(\exp(\theta^* D_2))) E^*(\exp(-\theta^* B))$.

Using Theorem 2.3, we arrive at the approximation

(24) $E\left( \sum_{n_0} 1\left(\max_m W^{(1)}_{n_0,m} > u\right) + 1\left(\max_m W^{(2)}_{n_0,m} > u\right) \right) \simeq n K^* \exp(-\theta^* u)$

for $n, u$ suitably chosen. Note that there are two approximations here. First
we approximate the partial maxima of the random walks with the global
maxima, and then we use Theorem 2.3 to approximate the tail of the distribution
of the global maxima. Admittedly, we have ignored the arithmetic
nature of the variables $X_n$. It is beyond the scope of this paper to deal
thoroughly with the distribution of $\mathcal{M}(Y)$. A Poisson approximation of the
stochastic variable

$\sum_{n_0} 1\left(\max_m W^{(1)}_{n_0,m} > u\right) + 1\left(\max_m W^{(2)}_{n_0,m} > u\right)$

that also provides a formal justification of (24) can be found in [5]. Such a
Poisson approximation shows, in addition, that

(25) $P(\mathcal{M}(Y) > u) \simeq 1 - \exp(-n K^* \exp(-\theta^* u))$.
Fig. 3. An alternative structure of mir-1 from Figure 2. It has more Watson-Crick pairs
that are obtained by allowing nucleotides to be skipped.

5. A final remark. In terms of the indices, the structures considered in
this paper take the form

$(i, j), (i - 1, j + 1), \ldots, (i - m, j + m)$.

It should be remarked that real RNA-structures are more complicated, and
one would, for instance, also consider structures of the form

$(i_1, j_1), (i_2, j_2), \ldots, (i_m, j_m)$

with $i_m < i_{m-1} < \cdots < i_1 < j_1 < \cdots < j_m$. Such structures allow for
nucleotides in the sequence to be skipped, see Figure 3. Finding the optimal
score, with a suitable penalty on skips, over such more general sets of struc- 
tures constitutes a combinatorial optimization problem that can be solved 
rather efficiently by dynamic programming techniques. A theoretical under- 
standing of the distributional behavior for the resulting optimal score seems, 
however, to be a challenging problem. The development for the similar prob- 
lem of local sequence alignment, see [4, 12], illustrates some of the difficulties 
that arise. 

Admittedly, the present paper makes no attempt to handle the general 
problem with skips, nor can we expect that the presented results about 
reflected random walks can contribute much to solving that problem. How- 
ever, we do illustrate in the simple case with no skips how the introduction 
of a hairpin-loop penalty can affect the optimal score, as indicated by (25),
when compared to no hairpin-loop penalty; see [6]. One important differ- 
ence is that n enters linearly in (25), whereas n enters quadratically in the 
corresponding result with no hairpin-loop penalty. 

Acknowledgments. The author thanks the referees for pointing out sev- 
eral places where additional details made the arguments more transparent. 
Thanks are also due to an Associate Editor, who provided some valuable 
suggestions. 